AllergenFP: allergenicity prediction by descriptor fingerprints

Authors

  • Ivan Dimitrov
  • Lyudmila Naneva
  • Irini A. Doytchinova
  • Ivan Bangov
Abstract

MOTIVATION: Allergenicity, like antigenicity and immunogenicity, is a property encoded both linearly and non-linearly, so alignment-based approaches cannot identify it unambiguously. A novel alignment-free, descriptor-based fingerprint approach is presented here and applied to distinguish allergens from non-allergens. The approach is implemented as a four-step algorithm. First, the protein sequences are described by amino acid principal properties such as hydrophobicity, size, relative abundance, and helix- and β-strand-forming propensities. Second, the resulting strings of different lengths are converted into vectors of equal length by auto- and cross-covariance (ACC) transformation. Third, the vectors are transformed into binary fingerprints. Finally, the fingerprints are compared in terms of the Tanimoto coefficient.

RESULTS: The approach was applied to a set of 2427 known allergens and 2427 non-allergens and correctly identified 88% of them, with a Matthews correlation coefficient of 0.759. The descriptor fingerprint approach presented here is universal: it could be applied to any classification problem in computational biology. The set of E-descriptors captures the main structural and physicochemical properties of the amino acids that make up proteins. The ACC transformation overcomes the main problem of alignment-based comparative studies, which arises from the different lengths of the aligned protein sequences. Converting protein ACC values into binary descriptor fingerprints allows similarity search and classification.

AVAILABILITY AND IMPLEMENTATION: The algorithm described in the present study is implemented in a purpose-built Web site named AllergenFP (FP stands for FingerPrint). AllergenFP is written in Python, with a GUI in HTML. It is freely accessible at http://ddg-pharmfac.net/AllergenFP.

CONTACT: [email protected] or [email protected].
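The four steps named in the abstract (descriptor encoding, ACC transformation, binarization, Tanimoto comparison) can be illustrated with a minimal sketch. This is not the AllergenFP implementation. The descriptor table `TOY_DESCRIPTORS` holds made-up values for four residues and two properties rather than the real E-descriptors, the lag range `max_lag` is an arbitrary choice, and thresholding at zero in `to_fingerprint` is an assumed binarization rule, since the abstract does not state how ACC values are turned into bits. The ACC terms use the common form: the sum over positions i of E_j(i) * E_k(i + lag), divided by (n - lag).

```python
from itertools import product


def acc_transform(seq, descriptors, max_lag):
    """Map a sequence of any length to a fixed-length list of ACC terms."""
    n = len(seq)
    if n <= max_lag:
        raise ValueError("sequence must be longer than max_lag")
    rows = [descriptors[aa] for aa in seq]  # one descriptor tuple per residue
    n_desc = len(rows[0])
    vector = []
    # j == k gives auto-covariance terms, j != k gives cross-covariance terms.
    for j, k in product(range(n_desc), repeat=2):
        for lag in range(1, max_lag + 1):
            total = sum(rows[i][j] * rows[i + lag][k] for i in range(n - lag))
            vector.append(total / (n - lag))
    return vector


def to_fingerprint(vector, threshold=0.0):
    """Binarize ACC values (assumed rule: bit is 1 when value > threshold)."""
    return [1 if v > threshold else 0 for v in vector]


def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits / on-bits in either fingerprint."""
    both = sum(a & b for a, b in zip(fp_a, fp_b))
    either = sum(a | b for a, b in zip(fp_a, fp_b))
    return both / either if either else 0.0


# Hypothetical two-property descriptor table (illustrative values only).
TOY_DESCRIPTORS = {
    "A": (0.5, -1.0),
    "K": (-1.5, 0.5),
    "L": (1.0, 1.0),
    "G": (-0.5, -1.5),
}

# Two sequences of different length yield fingerprints of equal length.
fp_1 = to_fingerprint(acc_transform("ALKGALKG", TOY_DESCRIPTORS, max_lag=2))
fp_2 = to_fingerprint(acc_transform("ALKGLLAKGA", TOY_DESCRIPTORS, max_lag=2))
similarity = tanimoto(fp_1, fp_2)
```

With d descriptors and lags 1 to L, every sequence maps to d * d * L values, which is why proteins of different lengths become directly comparable. A query protein could then be labeled by the class of its most similar fingerprint in a reference set, although the abstract does not specify the exact decision rule.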


Similar resources

Insubria QSAR PaDEL-Descriptor model for prediction of Esters toxicity in Daphnia magna

1.2. Other related models: E. Papa, F. Battaini, P. Gramatica. Ranking of aquatic toxicity of esters modelled by QSAR, Chemosphere (58), 2005, 559-570. [9] 1.3. Software coding the model: [1] PaDEL-Descriptor 2.18 A software to calculate molecular descriptors and fingerprints http://padel.nus.edu.sg/software/padel...


Insubria QSPR PaDEL-Descriptor model for Vapor Pressure prediction of Polybrominated Diphenyl Ethers

1.2. Other related models: E. Papa, S. Kovarich, P. Gramatica, 2009, Validation and Inspection of the Applicability Domain of QSPR Models for Physicochemical Properties of Polybrominated Diphenyl Ethers, QSAR & Comb. Sci. 28, 790-796. [9] 1.3. Software coding the model: [1] PaDEL-Descriptor 2.18 A ...


Insubria QSAR PaDEL-Descriptor model for prediction of Endocrine Disruptors Chemicals (EDC) Estrogen Receptor (ER)-binding affinity

1.2. Other related models: J. Li and P. Gramatica. The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders, Mol. Divers. 14, 2010, pp 687-696. [8] 1.3. Software cod...


Insubria QSPR PaDEL-Descriptor model for Melting Point prediction of Polybrominated Diphenyl Ethers

1.2. Other related models: E. Papa, S. Kovarich, P. Gramatica, 2009, Validation and Inspection of the Applicability Domain of QSPR Models for Physicochemical Properties of Polybrominated Diphenyl Ethers, QSAR & Comb. Sci. 28, 790-796. [10] 1.3. Software coding the model: [1] PaDEL-Descriptor 2.18 A ...


A database for allergenic proteins and tools for allergenicity prediction

UNLABELLED The AllergenPro database provides a web-based system with information about allergens in microbes, animals and plants. The database has three major parts and functions: (i) database list; (ii) allergen search; and (iii) allergenicity prediction. The database contains 2,434 allergens related information readily available in the database such as on allergens in rice mic...



Journal:
  • Bioinformatics

Volume 30, Issue 6

Pages -

Publication date 2014